Search results for "natural language"
showing 10 items of 650 documents
On the empirical spectral distribution for certain models related to sample covariance matrices with different correlations
2021
Given [Formula: see text], we study two classes of large random matrices of the form [Formula: see text] where for every [Formula: see text], [Formula: see text] are iid copies of a random variable [Formula: see text], [Formula: see text], [Formula: see text] are two (not necessarily independent) sets of independent random vectors having different covariance matrices and generating well concentrated bilinear forms. We consider two main asymptotic regimes as [Formula: see text]: a standard one, where [Formula: see text], and a slightly modified one, where [Formula: see text] and [Formula: see text] while [Formula: see text] for some [Formula: see text]. Assuming that vectors [Formula: see t…
UML Style Graphical Notation and Editor for OWL 2
2010
OWL is becoming the most widely used knowledge representation language. It has several textual notations but no standard graphical notation apart from verbose ODM UML. We propose an extension to UML class diagrams (heavyweight extension) that allows a compact OWL visualization. The compactness is achieved through the native power of UML class diagrams extended with optional Manchester encoding for class expressions thus largely eliminating the need for explicit anonymous class visualization. To use UML class diagram notation we had to modify its semantics to support Open World Assumption that is central to OWL. We have implemented the proposed compact visualization for OWL 2 in a UML style …
Bayesian Modelling of Confusability of Phoneme-Grapheme Connections
2007
Deficiencies in the ability to map letters to sounds are currently considered to be the most likely early signs of dyslexia. This has motivated the use of Literate, a computer game for training this skill, in several Finnish schools and households as a tool in the early prevention of reading disability. In this paper, we present a Bayesian model that uses a student's performance in a game like Literate to infer which phoneme-grapheme connections the student currently confuses with each other. This information can be used to adapt the game to a particular student's skills as well as to provide information about the student's learning progress to their parents and teachers. We apply our model to …
User experience-based information retrieval from semistar data ontologies
2019
The time necessary for the doubling of medical knowledge is rapidly decreasing. In such circumstances, it is of utmost importance for the information retrieval process to be rapid, convenient and straightforward. However, it often lacks at least one of these properties. Several obstacles prevent domain experts from extracting knowledge from their databases without involving a third party in the form of IT professionals. The main limitation is usually the complexity of query languages and tools. This paper proposes an approach that uses a keyword-containing natural language for querying the database and exploits a system that can automatically translate such queries to already existi…
Probabilities to Accept Languages by Quantum Finite Automata
1999
We construct a hierarchy of regular languages such that the current language in the hierarchy can be accepted by 1-way quantum finite automata with a probability smaller than the corresponding probability for the preceding language in the hierarchy. These probabilities converge to 1/2.
Robust Neural Machine Translation: Modeling Orthographic and Interpunctual Variation
2020
Neural machine translation systems typically are trained on curated corpora and break when faced with non-standard orthography or punctuation. Resilience to spelling mistakes and typos, however, is crucial as machine translation systems are used to translate texts of informal origins, such as chat conversations, social media posts and web pages. We propose a simple generative noise model to generate adversarial examples of ten different types. We use these to augment machine translation systems’ training data and show that, when tested on noisy data, systems trained using adversarial examples perform almost as well as when translating clean data, while baseline systems’ performance drops by…
SisHiTra: A Hybrid Machine Translation System from Spanish to Catalan
2004
In the current European scenario, characterized by the coexistence of communities writing and speaking a great variety of languages, machine translation has become a technology of capital importance. In areas of Spain and of other countries, the co-officiality of several languages implies producing several versions of public information. Machine translation between all the languages of the Iberian Peninsula and from them into English will allow for a better integration of Iberian linguistic communities with one another and within Europe. The purpose of this paper is to show a machine translation system from Spanish to Catalan that deals with text input. In our approach, both deductive (linguistic) and…
Prosodic phenomena in simultaneous interpreting
2005
This paper reports on an empirical study on prosody in English-German simultaneous interpreting. It discusses prosody with particular reference to its tonal, durational and dynamic features, such as intonation, pauses, rhythm and accent, as well as its main functions, i.e. structure and prominence. Following a review of previous studies on the topic, a conceptual approach for the analysis of prosody in terms of structure and prominence is developed and subsequently applied to an authentic corpus of professional simultaneous interpretation consisting of three German versions of a 72-minute English source text. Prosodic patterns in the corpus are analyzed by means of a computer-aided method u…
On-line Construction of a Small Automaton for a Finite Set of Words
2012
In this paper we describe a "light" algorithm for the on-line construction of a small automaton recognising a finite set of words. The algorithm runs in linear time. We obtained good experimental results on real dictionaries, on biological sequences and on the sets of suffixes (resp. factors) of a set of words, which show that our automaton is close to the minimal one. For the suffixes of a text, we propose a modified construction that leads to an even smaller automaton. We moreover construct linear algorithms for the insertion and deletion of a word in a finite set, directly from the constructed automaton.
Register Variation Across English Pharmaceutical Texts: A Corpus-driven Study of Keywords, Lexical Bundles and Phrase Frames in Patient Information L…
2013
This study constitutes an initial step towards filling a gap in corpus linguistics studies of linguistic and phraseological variation across English pharmaceutical texts, in particular in terms of recurrent linguistic patterns. The study, conducted from a register perspective (Biber & Conrad, 2009), which employs both quantitative and qualitative research procedures, aims to provide a corpus-driven description of vocabulary and phraseology, namely key words, lexical bundles, and phrase frames, used in patient information leaflets and summaries of product characteristics (represented by 463 and 146 texts, respectively) written originally in English and collected in two domain-spec…